All Questions
11 questions
3 votes
1 answer
37 views
Confirm understanding of decision_function in Isolation Forest sklearn
I am looking to better understand sklearn's IsolationForest decision_function. My understanding is that if the metric is closer to -1, then the model is more confident ...
0 votes
1 answer
75 views
detecting abnormality in a specific feature with respect to others (unsupervised?)
I have a large dataset with a feature y which is dependent in part on features x1 and x2. All features are noisy, and y is also dependent on other parameters not captured in the dataset. I would like ...
0 votes
0 answers
111 views
Understanding Isolation Forest predictions
I'm running sklearn's IsolationForest on a dataset containing 2 classes of data, one that I know is the anomaly (~1.5% of the entire dataset), the other is the normal dataset. I'm using this (shuffled)...
2 votes
3 answers
406 views
K-Means anomaly detection not clustering anomalies
The following code takes a single column from a dataset and then adds 50 anomalies that are considerably larger than the maximum value of the dataset. ...
2 votes
1 answer
416 views
Adding anomalies to the Dataset
Recently I have been trying different Scikit-Learn anomaly detection and clustering methods, like DBSCAN and Isolation Forest. Based on how much training data I use, how I tweak the algorithms ...
2 votes
1 answer
80 views
Functions in scikit that detect outliers automatically?
I know a way to visualize outliers is to make a box plot, but wanted to know if scikit had any quick ways to detect outliers for each variable?
1 vote
2 answers
10k views
How can I replace outliers with maximum non-outlier value?
I am doing univariate outlier detection in Python. When I detect outliers for a variable, I know that the value should be whatever the highest non-outlier value is (i.e., the max if there were no ...
10 votes
3 answers
15k views
Isolation forest sklearn contamination param
I am working on an unsupervised anomaly detection task on time series data using an isolation forest algorithm. I am developing it in Python, more specifically using ...
2 votes
2 answers
1k views
How can I find anomalies in each row of data?
I have some reported data in which I want to spot anomalies. The columns are a facility name followed by that facility's monthly reports. ...
5 votes
1 answer
3k views
Isolation Forest Feature Importance
As of scikit-learn version 0.19.1, there is no implementation for calculating feature importance in an Isolation Forest. I'm also having trouble finding any online resources proposing ways to get at ...
4 votes
1 answer
6k views
Multivariate outlier detection with isolation forest: how to detect the most effective features?
I am trying to detect outliers in my dataset with 5000 observations and 800 features. I have followed the simple steps described in http://scikit-learn.org/stable/auto_examples/ensemble/...